The loan data comes from Prosper. The data set contains 113,937 loans with 81 variables on each loan, including loan amount, borrower rate (interest rate), current loan status, borrower income, borrower employment status, borrower credit history, and the latest payment information. The data set was last updated on 3/11/2014.
## [1] 81
## [1] "ListingKey"
## [2] "ListingNumber"
## [3] "ListingCreationDate"
## [4] "CreditGrade"
## [5] "Term"
## [6] "LoanStatus"
## [7] "ClosedDate"
## [8] "BorrowerAPR"
## [9] "BorrowerRate"
## [10] "LenderYield"
## [11] "EstimatedEffectiveYield"
## [12] "EstimatedLoss"
## [13] "EstimatedReturn"
## [14] "ProsperRating..numeric."
## [15] "ProsperRating..Alpha."
## [16] "ProsperScore"
## [17] "ListingCategory..numeric."
## [18] "BorrowerState"
## [19] "Occupation"
## [20] "EmploymentStatus"
## [21] "EmploymentStatusDuration"
## [22] "IsBorrowerHomeowner"
## [23] "CurrentlyInGroup"
## [24] "GroupKey"
## [25] "DateCreditPulled"
## [26] "CreditScoreRangeLower"
## [27] "CreditScoreRangeUpper"
## [28] "FirstRecordedCreditLine"
## [29] "CurrentCreditLines"
## [30] "OpenCreditLines"
## [31] "TotalCreditLinespast7years"
## [32] "OpenRevolvingAccounts"
## [33] "OpenRevolvingMonthlyPayment"
## [34] "InquiriesLast6Months"
## [35] "TotalInquiries"
## [36] "CurrentDelinquencies"
## [37] "AmountDelinquent"
## [38] "DelinquenciesLast7Years"
## [39] "PublicRecordsLast10Years"
## [40] "PublicRecordsLast12Months"
## [41] "RevolvingCreditBalance"
## [42] "BankcardUtilization"
## [43] "AvailableBankcardCredit"
## [44] "TotalTrades"
## [45] "TradesNeverDelinquent..percentage."
## [46] "TradesOpenedLast6Months"
## [47] "DebtToIncomeRatio"
## [48] "IncomeRange"
## [49] "IncomeVerifiable"
## [50] "StatedMonthlyIncome"
## [51] "LoanKey"
## [52] "TotalProsperLoans"
## [53] "TotalProsperPaymentsBilled"
## [54] "OnTimeProsperPayments"
## [55] "ProsperPaymentsLessThanOneMonthLate"
## [56] "ProsperPaymentsOneMonthPlusLate"
## [57] "ProsperPrincipalBorrowed"
## [58] "ProsperPrincipalOutstanding"
## [59] "ScorexChangeAtTimeOfListing"
## [60] "LoanCurrentDaysDelinquent"
## [61] "LoanFirstDefaultedCycleNumber"
## [62] "LoanMonthsSinceOrigination"
## [63] "LoanNumber"
## [64] "LoanOriginalAmount"
## [65] "LoanOriginationDate"
## [66] "LoanOriginationQuarter"
## [67] "MemberKey"
## [68] "MonthlyLoanPayment"
## [69] "LP_CustomerPayments"
## [70] "LP_CustomerPrincipalPayments"
## [71] "LP_InterestandFees"
## [72] "LP_ServiceFees"
## [73] "LP_CollectionFees"
## [74] "LP_GrossPrincipalLoss"
## [75] "LP_NetPrincipalLoss"
## [76] "LP_NonPrincipalRecoverypayments"
## [77] "PercentFunded"
## [78] "Recommendations"
## [79] "InvestmentFromFriendsCount"
## [80] "InvestmentFromFriendsAmount"
## [81] "Investors"
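The column count and name listing above look like the output of `ncol()` and `names()`. A minimal sketch of that inspection step follows, using a small made-up data frame so it runs on its own. The object name `loans` and the file name in the comment are assumptions, not taken from the original analysis.

```r
# Toy stand-in for the Prosper data (values are invented for illustration).
# The real analysis would read the CSV instead, for example:
#   loans <- read.csv("prosperLoanData.csv")   # file name is an assumption
loans <- data.frame(
  ListingKey   = c("A1", "B2", "C3"),
  Term         = c(36, 60, 36),
  LoanStatus   = c("Current", "Completed", "Chargedoff"),
  BorrowerRate = c(0.158, 0.092, 0.275),
  stringsAsFactors = FALSE
)

n_cols    <- ncol(loans)   # number of variables
col_names <- names(loans)  # variable names, in column order

print(n_cols)
print(col_names)
str(loans)                 # type and first values of every column
```

On the full data set the same calls would report 81 columns, as shown above.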
BorrowerState: The two-letter abbreviation of the state of the borrower's address at the time the listing was created.
LoanStatus: The current status of the loan: Cancelled, Chargedoff, Completed, Current, Defaulted, FinalPaymentInProgress, Past Due (1-15 days), Past Due (16-30 days), Past Due (31-60 days), Past Due (61-90 days), and Past Due (91-120 days).
IncomeRange: The income range of the borrower at the time the listing was created.
Term: The length of the loan expressed in months.
ListingCategory: The category of the listing that the borrower selected when posting their listing: 0 - Not Available, 1 - Debt Consolidation, 2 - Home Improvement, 3 - Business, 4 - Personal Loan, 5 - Student Use, 6 - Auto, 7 - Other, 8 - Baby&Adoption, 9 - Boat, 10 - Cosmetic Procedure, 11 - Engagement Ring, 12 - Green Loans, 13 - Household Expenses, 14 - Large Purchases, 15 - Medical/Dental, 16 - Motorcycle, 17 - RV, 18 - Taxes, 19 - Vacation, 20 - Wedding Loans.
Recommendations: Number of recommendations the borrower had at the time the listing was created.
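Because ListingCategory is stored as a numeric code, plots and tables are easier to read once the codes are mapped to their labels. The sketch below shows one way to do that with `factor()`. The vector `codes` is made-up sample data and the variable names are illustrative, not the ones used in the original analysis.

```r
# Labels for codes 0..20, in the order given in the variable definition
category_labels <- c(
  "Not Available", "Debt Consolidation", "Home Improvement", "Business",
  "Personal Loan", "Student Use", "Auto", "Other", "Baby&Adoption", "Boat",
  "Cosmetic Procedure", "Engagement Ring", "Green Loans", "Household Expenses",
  "Large Purchases", "Medical/Dental", "Motorcycle", "RV", "Taxes", "Vacation",
  "Wedding Loans"
)

# Made-up sample of numeric category codes
codes <- c(1, 7, 0, 20, 1)

# levels = 0:20 ties each numeric code to the label at the same position
listing_category <- factor(codes, levels = 0:20, labels = category_labels)

# Count loans per category, dropping categories that do not occur in the sample
print(table(droplevels(listing_category)))
```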
BorrowerAPR: The Borrower’s Annual Percentage Rate (APR) for the loan. An annual percentage rate (APR) is the annual rate charged for borrowing or earned through an investment. APR is expressed as a percentage that represents the actual yearly cost of funds over the term of a loan.
BorrowerRate: The Borrower’s interest rate for this loan.
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.0799 0.2289 0.2925 0.2823 0.3473 0.4135
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.0699 0.2005 0.2610 0.2507 0.3099 0.3600
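The two tables above have the shape of output from R's `summary()` function. The source does not label them, so reading them as BorrowerAPR followed by BorrowerRate is an assumption. A minimal sketch of how such a table is produced, using a made-up vector of rates:

```r
# Made-up interest rates standing in for a column such as loans$BorrowerRate
rates <- c(0.07, 0.20, 0.26, 0.31, 0.36)

# summary() on a numeric vector returns Min., 1st Qu., Median, Mean, 3rd Qu., Max.
rate_summary <- summary(rates)
print(rate_summary)

# Individual statistics can be pulled out by name
median_rate <- rate_summary[["Median"]]
mean_rate   <- rate_summary[["Mean"]]
```

Comparing the mean with the median in this way gives a quick hint about the skew of a variable before plotting it.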
TotalProsperLoans: Number of Prosper loans the borrower had at the time they created this listing. This value will be null if the borrower had no prior loans.
StatedMonthlyIncome: The monthly income the borrower stated at the time the listing was created.
IsBorrowerHomeowner: A borrower is classified as a homeowner if they have a mortgage on their credit profile or provide documentation confirming they are a homeowner.
MonthlyLoanPayment: The scheduled monthly loan payment.
LoanOriginationQuarter: The quarter in which the loan was originated.
IncomeRange: The income range of the borrower at the time the listing was created. LoanStatus: The current status of the loan: Cancelled, Chargedoff, Completed, Current, Defaulted, FinalPaymentInProgress, PastDue.
Borrowers in the middle income ranges (25,000 USD to 74,999 USD) hold the most loans, and in my view the loan amounts are large relative to their monthly income. The relationship appears to be that borrowers with a high income tend to take loans and complete them on time, while borrowers in a low income range may be unable to complete their loans on time.
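One way to check a relationship like this is to cross-tabulate income range against loan status and compare the share of each status within every income range. The sketch below uses a tiny made-up data frame. The income labels and counts are illustrative only and do not reproduce the real data.

```r
# Made-up sample; labels are illustrative
loans <- data.frame(
  IncomeRange = c("$1-24,999", "$1-24,999", "$25,000-49,999",
                  "$25,000-49,999", "$50,000-74,999", "$100,000+"),
  LoanStatus  = c("Chargedoff", "Completed", "Completed",
                  "Current", "Current", "Completed"),
  stringsAsFactors = FALSE
)

# Number of loans for each combination of income range and status
counts <- table(loans$IncomeRange, loans$LoanStatus)

# Row-wise proportions: share of each status within an income range
shares <- prop.table(counts, margin = 1)

print(counts)
print(round(shares, 2))
```

Row proportions matter here because the income ranges contain very different numbers of loans, so raw counts alone would mostly reflect group size.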
EmploymentStatus: The employment status of the borrower at the time they posted the listing.
AvailableBankcardCredit: The total available credit via bank card at the time the credit profile was pulled.
ProsperRating..numeric.: The Prosper Rating assigned at the time the listing was created: 0 - N/A, 1 - HR, 2 - E, 3 - D, 4 - C, 5 - B, 6 - A, 7 - AA. Applicable for loans originated after July 2009.
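Since the rating codes run from worst (HR) to best (AA), an ordered factor keeps that ranking available for comparisons and for axis ordering in plots. This is a sketch on made-up codes, not the original author's code. Note that it places "N/A" at the bottom of the ordering, which is a simplification.

```r
# Labels for codes 0..7, in the order given in the variable definition
rating_labels <- c("N/A", "HR", "E", "D", "C", "B", "A", "AA")

# Made-up sample of numeric rating codes
codes <- c(4, 7, 1, 6)

# ordered = TRUE records that later levels rank higher than earlier ones
rating <- factor(codes, levels = 0:7, labels = rating_labels, ordered = TRUE)

# Ordered factors support comparisons against a level name
at_least_b <- rating >= "B"

print(rating)
print(at_least_b)
```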
I was very interested in analyzing this dataset. The prosperLoanData dataset comes from Prosper, which was founded in 2005 as the first peer-to-peer lending marketplace in the United States. Since then, Prosper has facilitated more than $14 billion in loans to more than 870,000 people. The dataset contains 113,937 loans with 81 variables on each loan, including loan amount, borrower rate (interest rate), current loan status, borrower income, borrower employment status, borrower credit history, and the latest payment information.

First I looked at the dataset using the str and summary functions to get the structure and the five-number summary of the variables. Then I read the variable definitions and searched for more information on some of them. The dataset contained missing values that needed cleaning. In the first section I installed the needed packages, loaded the libraries, and removed the missing data (NAs). I ran into some bugs while coding, for example when converting between string and numeric formats, when converting from numeric to factor, and when extracting the dates and adding them as three separate variables.

In the univariate section I investigated 13 of the 81 variables, plotting each of them with ggplot and geom layers. To remove repetitive code I wrote functions that made the coding easier. The second section covers the relationships between pairs of variables, for example between stated monthly income and the borrower's occupation. The last section covers relationships among more than two variables.

Before this project I knew nothing about bank loans and how they work, which made the project somewhat difficult for me. I spent many hours researching the variables and watching videos about loans. It was a challenge, but I enjoyed exploring and analyzing this dataset.
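As an illustration of two of the cleaning steps mentioned above, removing rows with missing values and splitting a date into three separate variables, here is a minimal base R sketch. The data frame is made up, the timestamp format is an assumption about how the dates are stored, and the new column names (`OriginationYear` and so on) are hypothetical.

```r
# Made-up sample; the "YYYY-MM-DD HH:MM:SS" format is an assumption
loans <- data.frame(
  LoanOriginationDate = c("2013-11-05 00:00:00", "2009-07-20 00:00:00", NA),
  LoanOriginalAmount  = c(10000, 3500, 4000),
  stringsAsFactors = FALSE
)

# Drop every row that contains at least one NA
loans <- na.omit(loans)

# Parse the date part; trailing time characters are ignored by the format
origination <- as.Date(loans$LoanOriginationDate, format = "%Y-%m-%d")

# Add year, month and day as three separate integer columns (hypothetical names)
loans$OriginationYear  <- as.integer(format(origination, "%Y"))
loans$OriginationMonth <- as.integer(format(origination, "%m"))
loans$OriginationDay   <- as.integer(format(origination, "%d"))

print(loans)
```

Note that `na.omit()` on the whole data frame removes a row if any column is missing, which can discard a large share of the loans. Restricting it to the columns actually used in a plot is a gentler alternative.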